
Mon Nov 17th Tue Nov 18th

08:30 ~ 09:00
- Registration & Coffee~ Hall & Espacio Buñuel ~
09:00 ~ 09:30
- Sala 5
  - Business & Technical
  Spark, NoSQL & real time processing in Big Data
  
  English
  
  Óscar Méndez, CEO, Stratio and Paradigma Tecnológico
  
  Big Data Spain welcomes everyone. The event will be a two-day intense conference. We will meet many people, learn a lot and have fun.
09:30 ~ 10:15
- Sala 5
  - Business & Technical
  Apache Spark and OSS technologies by Paco Nathan @ Databricks
  
  English
  
  Paco Nathan, Director of Community Evangelism, Databricks
  
  How does Apache Spark fit within the landscape of Big Data technologies? These technologies have been changing abruptly, and this talk explores where that appears to be headed.
  
  On the one hand, the economics of datacenter technologies has shifted toward warehouse scale with commodity hardware, which implies multicore processors and large memory spaces. Incumbent technologies do not embrace those changes, while Spark and related OSS projects work in concert to leverage them. On the other hand, the shape of data requirements is changing abruptly, with sensor data, microsatellites, and other IoT use cases boosting data rates by orders of magnitude. We have decades-old advanced math techniques available to address these industrial needs, but how will our software frameworks keep pace?
  
  This talk addresses the effective integration of newer OSS technologies for the technical audience, while providing guidance to the business audience to understand fundamental drivers for these changes — how the use cases for Big Data are extending. We will consider the roles played by functional programming (making complex workflows tractable), by cloud-based notebooks (a new wave of flexibility and collaboration), as well as how some of the advanced math has enormous implications on real-time analytics at scale.
10:15 ~ 11:00
- Sala 5
  - Business & Technical
  Data warehouse modernization programme by IBM
  
  Analytics, English

  Toby Woolfe, Big Data Solutions Leader, IBM
  
  General Motors (GM) is in the process of constructing a single global information warehouse that will become the foundation for all business analytics and decision support across the enterprise.
11:00 ~ 11:30
- Coffee Break~ Espacio Buñuel ~
11:30 ~ 12:15
- Sala 5
  - Business
  MongoDB for your Big Data strategy by Norberto Leite
  
  MongoDB, NoSQL, English

  Norberto Leite, Solutions Architect | Eng, mongoDB Inc

  When one starts analysing the Big Data technology spectrum, one finds several different solutions for several different purposes. This may cause confusion, uncertainty and doubt about what to choose and what for, among both technical and business decision makers. This talk sheds some light on where you should consider MongoDB for your Big Data strategy and how to make the most of the dominant technologies in the field.
- Sala 4
  - Technical
  Hue for Hadoop integrates with Impala & Spark by Cloudera
  
  Hadoop, English

  Enrico Berti, UI Engineer, Cloudera's Hue
  
  Open up big data to your user base! Hadoop brings many data crunching possibilities but also comes with a lot of complexity.
12:15 ~ 13:00
- Sala 5
  - Business
  Trafodion SQL-on-HBase for transactional workloads
  
  Hadoop, English

  Rodrigo Merino, Senior Presales Solution Architect, Hewlett-Packard

  Trafodion is an open-source project that brings transactional SQL database capabilities to Hadoop. It extends Hadoop to provide guaranteed transactional integrity, enabling new kinds of big data applications to run on Hadoop.
- Sala 4
  - Technical
  Storing and processing data in Hadoop by Jacek Juraszek
  
  Hadoop, English

  Jacek Juraszek, Expert Java/Hadoop, Grupa Allegro
  Jarosław Grabowski, Senior Java/Hadoop Developer, Grupa Allegro

  It is a fact that Hadoop is not "the ultimate solution". Data processing is a hard task, but switching to big data, where there are many more scaling issues, is even harder. Most classic approaches do not take into account data passing around the network and are therefore not enough in the new landscape.
13:00 ~ 13:15
- Short Break~ Espacio Buñuel ~
13:15 ~ 14:00
- Sala 5
  - Business
  APIs, IoTs, big data, analytics and cognitive computing
  
  Analytics, IoT, English

  Andy Thurai, Technologist, IBM

  The birth of a sophisticated Internet of Things has catapulted hybrid data collection, which mixes structured and unstructured data, to new heights. The goal with any analytics software is to find and improve data sets rather than spend time identifying, prepping, and cleaning the data.
- Sala 4
  - Workshop
  Introduction to Stratio Streaming
  
  Streaming, Spanish

  David Morales, Big Data Architect, Stratio
  Antonio Navarro, Big Data Developer, Stratio
  
  Nowadays data-intensive processes and organizations of all sorts require the use of real-time data with increasing flexibility and complexity. We created Stratio Streaming to meet this demand.
14:00 ~ 14:45
- Lunch Break~ Espacio Buñuel ~
14:45 ~ 15:15
- Sala 5
  - Technical
  Round table with Databricks and Stratio at Big Data Spain
  
  Spark, English
  
  Apache Spark is taking the world of Big Data by storm. Which new use cases does Spark enable? What can we learn and use from the legacy of the Hadoop-centric ecosystem of the last few years?
15:15 ~ 16:00
- Sala 5
  - Technical
  Next-Generation NoSQL Data Stores – HyperDex
  
  English
  
  Emin Gün Sirer, Founder, Hyperdex
  
  Distributed key-value stores are now a standard component of high-performance web services and cloud computing applications. While key-value stores offer significant performance and scalability advantages, the first wave of NoSQL stores typically compromise on consistency, fault-tolerance, performance, or functionality, and sometimes on all four.
  
  This talk will present HyperDex, a novel, open-source, distributed key-value store developed by my group that provides (1) strong consistency guarantees, (2) fault-tolerance for failures and partitions affecting up to f nodes, and (3) a rich API which includes ACID transactions and a unique search primitive that enables queries on secondary attributes. HyperDex achieves these properties through the combination of three recent technical advances called hyperspace hashing, value-dependent chaining and linear transactions. Despite offering stronger guarantees than first-gen NoSQL data stores, HyperDex is also a factor of 2-13 faster than Cassandra and MongoDB.
  
  This talk will outline these techniques, identify new research directions, and discuss how these breakthroughs relate to the oft-quoted, but mostly misunderstood, CAP credo.
  
  Bio:
  Emin Gun Sirer is an Associate Professor of Computer Science at Cornell University. His current research focuses on infrastructure services for large-scale distributed systems, such as key-value stores, graph databases, and consensus protocols. He is also interested in self-organizing systems and cryptocurrencies.
- Sala 4
  - Workshop
  Google Cloud Platform to predict football matches
  
  ML, English

  Jordan Tigani, Google Software Developer - BigQuery, Google
  
  Predict the future with machine learning over sports data. Open source tools to DIY.
16:00 ~ 16:45
- Sala 5
  - Technical
  ToroDB a new NoSQL database that replaces mongoDB
  
  NoSQL, English

  Álvaro Hernández, CEO at WizzBill, 8Kdata
  
  Relational databases are not slow or inadequate for NoSQL and/or BigData tasks: they can rather be the building blocks of them.
- Sala 4
  - Workshop
  Setup a scalable MongoDB Cluster with MMS
  
  MongoDB, NoSQL, Spanish

  Norberto Leite, Solutions Architect | Eng, mongoDB Inc

  Come and learn how to easily set up a large, scalable and monitored MongoDB cluster using MMS. Lessons to be learned: how to set up a cluster on MMS; what to monitor and when; and how to upgrade a cluster without downtime in a couple of clicks.
16:45 ~ 17:00
- Drinks Break~ Espacio Buñuel ~
17:00 ~ 17:45
- Sala 5
  - Technical
  Real time analytics with MapReduce and in-memory
  
  Analytics, English

  Dr. William L. Bain, CEO at ScaleOut Software, Inc.
  
  Operational intelligence represents an important new step in the evolution of data analytics by integrating analytics into live systems to provide immediate feedback and identify emerging patterns. Object-oriented, in-memory models of real-world systems enable their behavior to be tracked and analyzed in real time using data-parallel computing techniques. This technique builds on the technology of Hadoop MapReduce but has important differences which enable real-time analysis of live data.
- Sala 4
  - Business
  Hands on Machine Learning for a Business audience
  
  ML, English

  David Gerster, Chief Data Scientist, BigML
  
  Hands on Machine Learning for a Business audience
17:45 ~ 18:30
- Sala 5
  - Technical
  Large-scale graphs with Google(TM) Pregel
  
  English
  
  Michael Hackstein, Front End and Graph Specialist, ArangoDB

  - Gives rules of thumb to decide whether Pregel is useful for the attendee's use case
  - Shows which criteria the attendee has to keep an eye on when making a technology decision
  - Shows some tips & tricks for implementing a Pregel algorithm
- Sala 4
  - Workshop
  BigInsights and streams: IBM Hadoop solution
  
  Analytics, Spanish

  Luis Reina, Data Specialist, IBM

  In this workshop Luis Reina will show two tools that come with IBM BigInsights:

  1) BigSheets is a way to generate Hadoop applications (MapReduce) without programming, so the end user can analyze big data without needing to know Java or other Hadoop languages such as Pig. BigSheets is a browser-based tool included in the InfoSphere BigInsights Console for analyzing and visualizing big data. It uses a spreadsheet-like interface that can model, filter, combine, and chart data collected from multiple sources, such as an application that collects social media data by crawling the Internet.

  2) Big SQL provides broad SQL support that is typical of commercial databases. You can issue queries using JDBC or ODBC drivers to access data that is stored in Hadoop, in the same way that you access databases from your enterprise applications. You can use the Big SQL server to execute standard SQL queries. Multiple queries can be executed concurrently.

  Big SQL provides support for large ad hoc queries by using MapReduce parallelism and point queries, which are low-latency queries that return information quickly to reduce response time and provide improved access to data.

  The IBM Hadoop edition enriches the standard Hadoop platform with high-value features, such as Big SQL, BigSheets, Text Analytics and others.

  Keywords: Hadoop, Big SQL, security

  Two takeaway points of the session:

  1. BigInsights is built on the standard Apache Hadoop platform

  2. BigInsights adds high-value features on top of the standard platform
18:30 ~ 19:15
- Sala 5
  - Technical
  Benchmarking Big Data systems by Cloudera
  
  Analytics, English

  Yanpei Chen, Performance Engineering, Cloudera
  Gwen Shapira, Software Engineer, Cloudera

  You should assess performance marketing claims for the technical rigor of their metrics and measurement methods. When running benchmarks, generating numbers is easy; understanding how to interpret the numbers you have is the real challenge. We will show you how to critically check your own benchmarks for common mistakes.
- Sala 4
  - Technical
  Geoquery massive amounts of HDFS data from Spark processes
  
  ML, English

  Marc Planagumà, Lead data engineer and researcher, BDigital
  
  This session aims to address the specific challenges of exploiting large amounts of geospatially enabled data by reviewing how researchers at BDigital Technology Centre have designed and implemented a stack for advanced Machine Learning on Urban Data and providing a way to geoquery massive amounts of HDFS data from Spark processes without hindering the overall system performance.
19:15 ~ 20:00
- Drinks~ Barra ~

08:30 ~ 08:55
- Coffee~ Espacio Buñuel ~
09:00 ~ 09:45
- Sala 5
  - Business & Technical
  Data Science: From Lab to Factory | Big Data Spain
  
  Hadoop, ML, English

  Sean Owen, Director of Data Science, Cloudera
  
  Machine Learning is not new. Big Machine Learning is qualitatively different.
09:45 ~ 10:30
- Sala 5
  - Business & Technical
  BigQuery for Genomics by Felipe Hoffa at Google
  
  Analytics, English

  Felipe Hoffa, Developer Relations engineer, BigQuery team, Google

  How big is the human genome? What tools can be used to manage and understand it?
  
  It turns out that the same SQL powers that Google BigQuery makes available for general usage can be applied to genomics. In this session we'll introduce the basics of managing genomes with our favorite big data tools, drawing parallels with more traditional use cases like analyzing view logs.
  
  Takeaways:
  
  The same SQL constructs that help us understand the world can help us understand the basic fabric of life.

  Live demos will highlight how we can leverage the latest Google tools and services to accelerate data insights, bringing them from batch to interactive real time.
10:30 ~ 11:15
- Sala 5
  - Business & Technical
  Visual tools for Spark and Cassandra applications by Stratio
  
  NoSQL, Spark, English

  Óscar Méndez, CEO, Stratio and Paradigma Tecnológico
  
  Keynote by Óscar Méndez
11:15 ~ 11:45
- Coffee Break~ Espacio Buñuel ~
11:45 ~ 12:30
- Sala 5
  - Business & Technical
  Internet of Things & Large scale Data Analysis by Amazon
  
  IoT, English

  Andreas Chatzakis, AWS Solutions Architect, Amazon Web Services UK Ltd
  
  This session describes how to build large-scale data collection and processing architectures on AWS, and shows how to use an Intel Galileo board to collect sensor data that is sent to backend processing services such as Amazon Kinesis or Amazon Redshift.
12:30 ~ 13:15
- Sala 5
  - Technical
  Graph use-cases with RDBMS and NOSQL stores by Jim Webber
  
  NoSQL, English

  Jim Webber, Chief Scientist, Neo Technology

  This talk covers the evolution of graphs as a primary pillar of the data movement and contrasts graph use cases with RDBMS and contemporary NoSQL stores, with a slight detour through distributed systems.
- Sala 4
  - Workshop
  Machine Learning to predict low risk loans by BigML
  
  Analytics, ML, English

  Poul Petersen, CIO, BigML

  Traditionally, analyzing big data with machine learning tools has been prohibitively complex and expensive. In this session you will see how BigML makes machine learning more accessible than ever thanks to its well-defined workflow, insightful visualizations, and fully featured REST API.
  
  Using only a browser, we will develop a system to predict low risk loans using the rich data available from Lending Club. Techniques applied will include dataset transformations, random decision forests, clustering, anomaly detection, batch predictions, evaluations and more.
  
  An IPython notebook will be provided that utilizes BigML’s API to easily repeat every step taken during the session.
  
  Requirements for attendees to follow along: • BigML account
13:15 ~ 13:30
- Short Break~ Espacio Buñuel ~
13:30 ~ 14:15
- Sala 5
  - Technical
  Sinfonier real-time analytics and cybersecurity by Telefonica
  
  Analytics, Streaming, English, Cybersecurity

  Fran Gómez, Security Area, Telefónica
  
  Never in the history of the world has so much information been available at our fingertips. Yet taking advantage of this ever-increasing amount of information is hampered by the lack of dynamic and user-friendly technologies that provide real-time processing capabilities, a serious handicap in a business where time is always critical.
- Sala 4
  - Workshop
  Introduction to Neo4j Workshop by Jim Webber
  
  GraphDB, English

  Jim Webber, Chief Scientist, Neo Technology
  
  In 45 short minutes, Neo4j's Chief Scientist Jim Webber will take you through the fundamentals of Graphs, Graph Modelling and Neo4j's Cypher query language, culminating in a live delivery of a realistic retail recommendations system.
  
  Come along to see how things that would take months with legacy data technology take minutes with graphs, and leave with a good basic grounding in how to repeat the approach in your data.
14:15 ~ 15:00
- Lunch Break~ Espacio Buñuel ~
15:00 ~ 15:30
- Sala 5
  - Business
  EMC Pivotal, Stratio and GFT Group round table BigDataSpain
  
  English
  
  In 2013, the conference Big Data Spain managed to convey the message that Big Data does not have to be the expensive, complicated and cumbersome dragon that corporations did not dare to deal with.
  
  As in all hype cycles, the meaning of buzzwords like Big Data often gets lost in translation. Are case studies the best way to explain Big Data to the top management and decision makers?
  
  How is the enterprise world adopting Big Data in 2015? What is the good news, and what challenges lie ahead?
15:30 ~ 16:15
- Sala 5
  - Business & Technical
  Analysis of tourists in Madrid & Barcelona | Big Data Spain
  
  Analytics, English

  Albert Solana, Consultant, RocaSalvatella

  This talk presents a new methodology for improved analysis and knowledge of the Spanish tourism industry, using real data gathered from cellphones and credit card transactions. It differentiates what it means “to have the data”, as Telefónica Móviles España or BBVA did, from “to analyze the data”, as Telefónica I+D did, or “to know what specific questions to ask of the data”, as RocaSalvatella did.
- Sala 4
  - Workshop
  Stratio Crossdata: a SQL-like language for streaming queries
  
  NoSQL, Spark, Spanish

  Álvaro Agea, Big Data Architect, Stratio
  Daniel Higuero, Big Data Architect, Stratio
  
  Crossdata is a distributed peer-to-peer fault-tolerant framework that unifies the interaction with batch and streaming sources supporting multiple datastore technologies.
16:15 ~ 17:00
- Sala 5
  - Technical
  R with Hadoop for large-scale analytics
  
  Analytics, English

  Jose Luis López, Data Engineer, GetYourGuide

  Large-scale data analysis is currently difficult and inaccessible for many people. This academic work presents a major step toward making it more accessible.
- Sala 4
  - Workshop
  ETL with Google BigQuery and Javascript by Thomas Park
  
  Analytics, English

  Felipe Hoffa, Developer Relations engineer, BigQuery team, Google
  Thomas Park, Senior Software Engineer, Google
  
  How to extend SQL with Javascript UDFs: User Defined Functions. Practical uses and how to run your Javascript functions on a cluster of hundreds or thousands of nodes.
17:00 ~ 17:15
- Drinks Break~ Espacio Buñuel ~
17:15 ~ 18:00
- Sala 5
  - Technical
  Analytics for ads servers by MediaSmart Mobile
  
  Analytics, English

  Alex Fernández, Senior Developer, MediaSmart Mobile
  
  We will discuss how we created a system to manipulate big data from scratch, using Amazon RedShift, in a constrained resources scenario.
- Sala 4
  - Workshop
  Kettle, an ETL tool and NoSQL databases
  
  NoSQL, Spanish

  Ignacio Bustillo, Computer Scientist & Data Consultant, U-tad

  Learn how to develop big data processes with drag and drop. Learn how to store data in technologies such as MongoDB or Cassandra with drag and drop and without technical knowledge.
18:00 ~ 18:45
- Sala 5
  - Technical
  Stateless dataflows MapReduce Spark
  
  ML, English

  Raúl Castro Fernández, Computer Science PhD student, Imperial College
  
  Stateful Dataflow Graphs (SDG) are a new dataflow representation that introduces 'state' explicitly in the dataflow. This talk includes reflections about this new programming model---its virtues and limitations---and a short discussion about a now long pursuit of “the Big Data Language”.
- Sala 4
  - Workshop
  Install HyperDex NoSQL clusters and API by Robert Escriva
  
  NoSQL, English

  Robert Escriva, Co-founder, Hyperdex
  
  This workshop will provide a ground-up introduction to HyperDex. Topics demonstrated in this session include:
  
  - Installing HyperDex
  - Deploying a cluster
  - Exploring HyperDex's rich API
  - Scaling the cluster horizontally
  - Backing up and restoring a running cluster
  
  Keywords: NoSQL, Document Store, Key-Value Store
  
  Bio:
  Robert Escriva is the co-founder and chief architect at HyperDex, a next-generation data and document store that provides high performance, fault tolerance, and strong consistency guarantees. He is broadly interested in building infrastructure for large-scale distributed systems.
18:45 ~ 19:15
- Sala 5
  Closing Session
19:15 ~ 20:00
- Drinks~ Barra ~

* Subject to changes and adjustments

THANK YOU FOR AN AMAZING CONFERENCE!
